Scour
🎮 Reinforcement Learning
Q-Learning, Policy Gradients, OpenAI Gym, Reward Functions
Scoured 122405 posts in 643.3 ms
Goal-Conditioned Reinforcement Learning from Sub-Optimal Data on Metric Spaces
arxiv.org · 14h · 📊 Optimization
Check out this article on Reinforcement Learning with R: Origins, Real-Life Applications, and Practical Implementation
dev.to · 2d · Discuss: DEV · 📊 Statistical Computing
Mitigating Reward Hacking in RLHF via Bayesian Non-negative Reward Modeling
arxiv.org · 14h · 🎲 Bayesian statistics
Show HN: Fighting the War Against Expensive Reinforcement Learning
cadenza-landing-qtu7gbjwb-akshparekh123-3457s-projects.vercel.app · 12h · Discuss: Hacker News · 🤖 Machine learning
Optimizing post-disaster road restoration with reinforcement learning: A traveler-behavior-aware approach
sciencedirect.com · 3h · 📊 Optimization
A Conceptual Framework for Exploration Hacking
lesswrong.com · 3h · 🎲 Bayesian statistics
A training principle for drifting models
breno.bearblog.dev · 8h · 🤖 Machine learning
Recursive self-improvement from AI models
marginalrevolution.com · 2d · Discuss: Hacker News · 📊 Optimization
Generalized Lanczos method for systematic optimization of neural-network quantum states
link.aps.org · 9h · 📊 Optimization
Learning Optimization Tools
trendhunter.com · 2d · 📊 Optimization
How to Leverage Explainable AI for Better Business Decisions
towardsdatascience.com · 4h · 🤖 Machine learning
Robotics Motion Learning: Training Linked Robot Arms with Kuramoto Models
hackernoon.com · 1d · 🤖 Machine learning
Researchers propose a self-distillation fix for ‘catastrophic forgetting’ in LLMs
infoworld.com · 9h · 📊 Optimization
EyesOff: Why Some Models Quantize Better Than Others
ym2132.github.io · 20h · Discuss: Hacker News · 🤖 Machine learning
ashworks1706/rlhf-from-scratch: A theoretical and practical deep dive into Reinforcement Learning with Human Feedback and its applications in Large Language Models from scratch.
github.com · 2d · Discuss: Hacker News · 📊 Statistical Computing
A masterclass in AI security operations
redcanary.com · 5h · 🤖 Machine learning
In defense of wasting time
fastcompany.com · 18m · 🧘 Digital Minimalism
Feedback Control for Computer Systems
janert.org · 12h · 📊 Statistical Computing
AI Beyond The Chatbot: The New Value Chain
seekingalpha.com · 6h · 🤖 Machine learning
A multi-agent reinforcement learning approach to autonomous aircraft taxiing with taxiing time, fuel consumption, and emission optimization
sciencedirect.com · 1d · 📊 Optimization